Introducing Symmetries to Black Box Meta Reinforcement Learning
Abstract
Meta reinforcement learning (RL) attempts to discover new RL algorithms automatically from environment interaction. In so-called black-box approaches, the policy and the learning algorithm are jointly represented by a single neural network. These methods are very flexible, but they tend to underperform compared to human-engineered RL algorithms in terms of generalisation to new, unseen environments. In this paper, we explore the role of symmetries in meta-generalisation. We show that a recent successful meta RL approach that meta-learns an objective for backpropagation-based learning exhibits certain symmetries (specifically the reuse of the learning rule, and invariance to input and output permutations) that are not present in typical black-box meta RL systems. We hypothesise that these symmetries can play an important role in meta-generalisation. Building off recent work in black-box supervised meta learning, we develop a black-box meta RL system that exhibits these same symmetries. We show through careful experimentation that incorporating these symmetries can lead to algorithms with a greater ability to generalise to unseen action & observation spaces, tasks, ...
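The two symmetries named in the abstract (reuse of one learning rule, and invariance to input and output permutations) can be illustrated with a toy layer. The sketch below is not the paper's architecture: `shared_rule_layer`, `permute`, `is_equivariant`, and the three-scalar `theta` are hypothetical names and a hypothetical parameterisation chosen only for illustration. It assumes every connection between an input and an output carries its own scalar state, and that all connections are updated by the same small rule.

```python
import math
import random


def shared_rule_layer(x, state, theta):
    """Apply one shared rule to every (input i, output j) connection.

    x:     list of n_in input values
    state: n_in x n_out nested list, one scalar state per connection
    theta: three scalars shared by all connections (hypothetical parameterisation)
    """
    a, b, c = theta
    # The same tiny rule updates every connection, so the number of
    # parameters does not depend on n_in or n_out.
    new_state = [
        [math.tanh(a * s_ij + b * x_i + c) for s_ij in row]
        for x_i, row in zip(x, state)
    ]
    # Each output sums over its incoming connections; a sum ignores input order.
    n_out = len(state[0])
    y = [sum(row[j] for row in new_state) for j in range(n_out)]
    return y, new_state


def permute(x, state, p_in, p_out):
    """Reorder inputs by p_in and outputs by p_out."""
    return [x[i] for i in p_in], [[state[i][j] for j in p_out] for i in p_in]


def is_equivariant(n_in=4, n_out=3, seed=0):
    """Check that permuting inputs/outputs only reorders the layer's outputs."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n_in)]
    state = [[rng.gauss(0, 1) for _ in range(n_out)] for _ in range(n_in)]
    theta = (0.5, -1.0, 0.1)
    p_in = rng.sample(range(n_in), n_in)
    p_out = rng.sample(range(n_out), n_out)

    y, _ = shared_rule_layer(x, state, theta)
    x_p, state_p = permute(x, state, p_in, p_out)
    y_p, _ = shared_rule_layer(x_p, state_p, theta)
    # Output k of the permuted layer should equal output p_out[k] of the original.
    return all(math.isclose(y_p[k], y[j], abs_tol=1e-9) for k, j in enumerate(p_out))
```

Because `theta` is shared, the same three numbers can be applied to layers of any width, which is one way to read "reuse of the learning rule"; a standard dense layer, by contrast, ties each weight to a fixed input and output position and has neither property.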
Similar resources
Policy Improvement: Between Black-Box Optimization and Episodic Reinforcement Learning
Policy improvement methods seek to optimize the parameters of a policy with respect to a utility function. There are two main approaches to performing this optimization: reinforcement learning (RL) and black-box optimization (BBO). In recent years, benchmark comparisons between RL and BBO have been made, and there have been several attempts to specify which approach works best for which types o...
From Black-Box Learning Objects to Glass-Box Learning Objects
In the field of e-learning, a popular solution to make teaching material reusable is to represent it as learning objects (LOs). However, building better adaptive educational software also takes an explicit model of the learner's cognitive process related to LOs. This paper presents a three-layer model that explicitly connects the description of learners' cognitive processes to LOs. The first laye...
Learning in a Black Box
Many interactive environments can be represented as games, but they are so large and complex that individual players are mostly in the dark about others’ actions and the payoff structure. This paper analyzes learning behavior in such ‘black box’ environments, where players’ only source of information is their own history of actions taken and payoffs received. The context of our analysis are dec...
Some Considerations on Learning to Explore via Meta-Reinforcement Learning
We consider the problem of exploration in meta reinforcement learning. Two new meta reinforcement learning algorithms are suggested: E-MAML and E-RL. Results are presented on a novel environment we call 'Krazy World' and a set of maze environments. We show E-MAML and E-RL deliver better performance on tasks where exploration is important.
Policy Improvement Methods: Between Black-Box Optimization and Episodic Reinforcement Learning
Policy improvement methods seek to optimize the parameters of a policy with respect to a utility function. There are two main approaches to performing this optimization: reinforcement learning (RL) and black-box optimization (BBO). Whereas BBO algorithms are generic optimization methods that, due to their generality, may also be applied to optimizing policy parameters, RL algorithms are specifi...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2022
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v36i7.20681